Chair: Roman Schneider, Gertrud Faaß

SIG Text Technology is concerned with the integration of markup languages and linguistic data processing. The goal is to enable the development of innovative text models and content-oriented text processing and use. Our main focus is on the processing of German-language texts and innovative text types (e.g. [song lyrics](http://songkorpus.de/)); the scope also extends to language varieties, sociolects and regiolects, and the contrastive study of less-researched languages, which may require a review and extension of existing standards.

Detecting differences between these languages or language varieties and already well-studied languages, i.e. contrastive research, calls for the compilation of suitable data (corpora). To this end, the SIG defines relevant metadata categories and supports the creation and documentation of relevant gold standards.

JLCL Vol 36 (1)

A special issue of the Journal for Language Technology and Computational Linguistics, JLCL 36(1), on the theme "Challenges in Computational Linguistics, Empiric Research & Multidisciplinary Potential of German Song Lyrics" (edited by Roman Schneider and Gertrud Faaß), is now available online.

Corpus Linguistics

The working group addresses the development and testing of tools for the automatic analysis of corpora, as well as the construction and application of mathematical, quantitative models for exploratory corpus analysis.

The working group addresses the following topics:

  • Preparation and annotation of corpora.
  • Corpus-analytic quantification (metrization) of the properties and relations of linguistic units.
  • Extraction, reconstruction or exploration of linguistic knowledge from corpora of natural language texts.
  • Promotion of applications in the field of text analysis and text technology.
  • Support of linguistic theories.